Abstract



We consider the symmetric rendezvous search game on a complete graph of n locations.
In 1990, Anderson and Weber proposed a strategy in which, over successive blocks of n - 1
steps, the players independently choose either to stay at their initial location or to tour
the other n - 1 locations, with probabilities p and 1 - p, respectively. Their strategy has
been proved optimal for n = 2 with p = 1/2, and for n = 3 with p = 1/3. The proof for
n = 3 is very complicated and it has been difficult to guess what might be true for n > 3.
Anderson and Weber suspected that their strategy might not be optimal for n > 3, but
they had no particular reason to believe this and no one has been able to find anything
better. This paper describes a strategy that is better than Anderson-Weber for n = 4.
However, it is better by only a tiny fraction of a percent.

1 The Anderson-Weber strategy

In the symmetric rendezvous search game on Kn (the completely connected graph on n
vertices) two players are initially placed at two distinct vertices (called locations). The game
is played in discrete steps and at each step each player can either stay where he is or move 
to a different location. The players share no common labelling of the locations. Our aim is 
to find a (randomizing) strategy such that if both players independently follow this strategy 
then they minimize the expected number of steps until they first meet. Rendezvous search 
games of this type were first proposed by Steve Alpern in 1976. They are simple to describe, 
and have received considerable attention in the popular press as they model problems that 
are familiar in real life. They are notoriously difficult to analyse. 

The Anderson-Weber strategy is a mixed strategy that proceeds in blocks of n - 1 steps.
Players begin at distinct locations, called their home locations. In each successive block a
player either stays at his home location, with probability p, or makes a randomly chosen tour
of his n - 1 non-home locations, doing this with probability 1 - p. The motivation for the
strategy comes from the wait-for-mommy strategy that is optimal in an asymmetric version
of the problem. With probability 2p(1 - p) the players play the wait-for-mommy strategy
over the first n - 1 steps and so rendezvous in expected time (n + 1)/2.

Anderson and Weber (1990) proved that the above strategy is optimal for the game on 
K2, with p = 1/2, and conjectured that it should be optimal for K3, with p = 1/3. This 
was finally proved by Weber (2006), who established a strong AW property (SAW) that AW
minimizes E[min{T, k}] for all k. Anderson and Weber suspected that their strategy might
not be optimal for n > 3, but they had no particular reason to believe this and no one has 
been able to find any strategy that is better. Indeed, AW has been shown optimal amongst 
2-Markov strategies. Fan (2009) showed that AW minimizes P(T > 2) and E[min{T, 2}].
He also found that AW is not optimal on K4 if players have the extra information that the
locations can be viewed as being arranged on a circle and the players are given a common
notion of clockwise. However, the question as to whether or not AW is optimal has remained 
open for the case in which there is no such special extra information. Fan writes, 'The author 
believes that SAW still holds on K4, and so AW strategy is still optimal'. We were inclined 
to agree, but now find that AW can be bettered. For more background to the problem see 
Weber (2006). 

Let us begin by reprising the AW strategy for the symmetric rendezvous game on K4. We 
assume that there is no special knowledge (such as a common notion of clockwise on a circle). 
The AW strategy is a 3-Markov strategy that repeats in blocks of 3 steps. In each successive 
block of 3 steps, each player, independently, remains at his home location with probability p, 
or does a randomly chosen tour of his 3 non-home locations, with probability 1 - p. This leads
to rendezvous in an expected number of steps ET, where

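\[
\begin{aligned}
ET &= p^2 (3 + ET) + 2p(1-p) \times 2 + (1-p)^2 \left( \tfrac{1}{2}\cdot\tfrac{16}{9} + \tfrac{1}{2}(3 + ET) \right) \\
   &= \frac{43 - 14p + 25p^2}{9\,(1 + 2p - 3p^2)} .
\end{aligned}
\]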

This is explained as follows. 

1. If both stay home they do not meet. 

2. If one stays home, while the other tours, then they meet in expected time 2. 

3. If both tour, then they meet with probability 1/2, and conditional on meeting they 
meet in expected time 16/9. 
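
For completeness, here is a short sketch of how the closed form follows from the recursion for ET: collect the ET terms on the left and simplify. The intermediate lines are our own working, not taken from the original derivation.

```latex
\begin{align*}
ET\left(1 - p^2 - \tfrac{1}{2}(1-p)^2\right)
  &= 3p^2 + 4p(1-p) + (1-p)^2\left(\tfrac{8}{9} + \tfrac{3}{2}\right), \\
ET \cdot \tfrac{1}{2}\left(1 + 2p - 3p^2\right)
  &= \tfrac{1}{18}\left(43 - 14p + 25p^2\right), \\
ET &= \frac{43 - 14p + 25p^2}{9\,(1 + 2p - 3p^2)} .
\end{align*}
```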

One easily finds that the minimum of ET is achieved by taking

\[
p = \tfrac{1}{4}\left(3\sqrt{681} - 77\right) \approx 0.321983
\]

and then

\[
ET = \tfrac{1}{12}\left(15 + \sqrt{681}\right) \approx 3.42466 .
\]
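
The optimal p and the resulting ET can be checked numerically. The following is a minimal sketch; the function name `expected_time` is ours, and the stationarity condition in the comment (2p^2 + 77p - 25 = 0) comes from differentiating the closed form for ET.

```python
import math

def expected_time(p: float) -> float:
    """Closed-form expected meeting time of AW on K4 with stay probability p."""
    return (43 - 14 * p + 25 * p * p) / (9 * (1 + 2 * p - 3 * p * p))

# Setting dET/dp = 0 reduces to 2p^2 + 77p - 25 = 0; take the root in (0, 1).
p_star = (3 * math.sqrt(681) - 77) / 4
et_star = expected_time(p_star)

print(p_star, et_star)
# Nearby values of p should give a larger expected meeting time.
print(expected_time(p_star - 1e-3) > et_star, expected_time(p_star + 1e-3) > et_star)
```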



2 A strategy better than Anderson-Weber on K4

We now explain how the AW strategy can be bettered. Suppose player I has location 1 as 
his home, and player II has location 2 as home. We might imagine that each player labels his 
non-home locations as a, b, c, and so a tour of his non-home locations is one of six possible 
tours: abc, acb, bac, bca, cab, cba. In the case that player I has (a,b,c) = (2,3,4) and player
II has (a, b, c) = (1, 3, 4) we can compute the matrix 



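\[
B = \begin{pmatrix}
2 & X & 3 & X & X & 2 \\
X & 2 & X & 2 & 3 & X \\
3 & X & 1 & 1 & X & X \\
X & 2 & 1 & 1 & X & X \\
X & 3 & X & X & 1 & 1 \\
2 & X & X & X & 1 & 1
\end{pmatrix}
\]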



where we have ordered the rows and columns to correspond to abc, acb, bac, bca, cab, cba. A 
number entry indicates the step at which players meet when they meet, and X indicates that 
they do not meet. There are 36 such matrices, over which we must average, for each possible 
pair of assignments by players I and II, of (2,3,4) and (1,3,4), respectively, to (a,b,c). 
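
The matrix B can be reproduced by direct enumeration. The sketch below assumes the labelling stated in the text, (a,b,c) = (2,3,4) for player I and (1,3,4) for player II; the helper name `meeting_step` is ours. It also confirms that half of the 36 tour pairs meet, at a mean step of 16/9.

```python
from fractions import Fraction
from itertools import permutations

# Tours in the order abc, acb, bac, bca, cab, cba for each player's own labelling.
tours_I = list(permutations((2, 3, 4)))   # player I, home location 1
tours_II = list(permutations((1, 3, 4)))  # player II, home location 2

def meeting_step(tour_i, tour_ii):
    """Return the 1-based step at which the two tours coincide, or None."""
    for step, (a, b) in enumerate(zip(tour_i, tour_ii), start=1):
        if a == b:
            return step
    return None

B = [[meeting_step(u, v) for v in tours_II] for u in tours_I]
for row in B:
    print(" ".join("X" if e is None else str(e) for e in row))

steps = [e for row in B for e in row if e is not None]
print(len(steps), Fraction(sum(steps), len(steps)))  # number of meeting entries, mean step
```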

Let us begin by noting that if a player stays home for three steps and meeting does not 
occur, then the other player must also have been staying home. Similarly, if a player tours for 
three steps and meeting does not occur, then the other player must also have been touring 
(and their tours not meeting). Thus after any 3k steps (a multiple of 3) each player knows 
exactly how many times both have been touring. 

Whenever a player makes a tour in the AW strategy he chooses his tour at random 
(independently of previous tours). We show how to improve AW by introducing some dependence
between tours. Let us adopt a notation in which the first tour a player makes is labelled A. 
The second distinct tour a player makes is labelled B, and so on. So, for example, AAB 
means that on his first three tours, a player (i) first makes a random tour, (ii) second makes 
the same tour as his first tour, (iii) and third makes a tour chosen randomly from amongst 
the 5 tours he has not yet tried. 
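
To make the pattern notation concrete, here is a small illustrative sampler that turns a pattern such as AAB into an actual sequence of tours. The function name `sample_tours` and the string encoding of tours are our own choices, not part of the original notation.

```python
import random
from itertools import permutations

TOURS = ["".join(t) for t in permutations("abc")]  # abc, acb, bac, bca, cab, cba

def sample_tours(pattern, rng=random):
    """Turn a pattern such as 'AAB' into a concrete sequence of tours.

    Each new letter is assigned a tour chosen uniformly from those not yet used;
    a repeated letter reuses the tour already assigned to it.
    """
    chosen = {}  # pattern letter -> tour assigned to it
    for letter in pattern:
        if letter not in chosen:
            unused = [t for t in TOURS if t not in chosen.values()]
            chosen[letter] = rng.choice(unused)
    return [chosen[letter] for letter in pattern]

print(sample_tours("AAB", random.Random(0)))
```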

Let us consider first a modified problem in which at each so-called 't-step' each player 
makes a tour of his non-home locations. In this modified problem no player stays home for a 
t-step. We wish to minimize the expected number of t-steps until the players meet. At the 
first t-step both players do A and the probability of meeting is 1/2. If a 1-Markov strategy 
is employed, so successive t-steps are chosen at random, then the expected number of t-steps 
until meeting occurs is 2. 

Over the first two t-steps, the players can do either AA or AB. Let P2 denote the matrix
for not meeting. One can check that P2 ≻ 0 (i.e., P2 is positive definite), thus for a
2-Markov strategy we would be solving

\[
ET = x^T (J + P_2\, ET)\, x
\]

where J is a 2 x 2 matrix filled with 1s. This has a minimum value of ET = 2, when we take
x^T = (1/6, 5/6). This means that, restricting to 2-Markov strategies, tours should be chosen
at random.

Similarly, over the first three t-steps, the players can do AAA, AAB, ABA, ABB, ABC.
Let P3 denote the matrix for not meeting. Again, P3 ≻ 0, and

(x_AAA, x_AAB, x_ABA, x_ABB, x_ABC) = (1/36, 5/36, 5/36, 5/36, 20/36)

is optimal in the sense of minimizing the solution of

\[
ET = x^T (J + P_2 + P_3\, ET)\, x ,
\]

where J is now 5 x 5 and P2 is expanded to the appropriate 5 x 5 matrix. Thus amongst
3-Markov strategies, tours should also be chosen at random.

However, over four t-steps things turn out differently. There are now 15 possible strategies: 
AAAA, AAAB, AABA, AABB, AABC, ABAA, ABAB, ABAC, ABBA, ABBB, ABBC, 
ABCA, ABCB, ABCC, ABCD. Let P4 denote the matrix for not meeting, which can be computed.
It now turns out that P4 has a negative eigenvalue. The AW strategy would be to choose
tours at random, which gives

\[
x^T = (p_{AAAA}, p_{AAAB}, p_{AABA}, p_{AABB}, p_{AABC}, p_{ABAA}, p_{ABAB}, p_{ABAC}, p_{ABBA}, p_{ABBB},
p_{ABBC}, p_{ABCA}, p_{ABCB}, p_{ABCC}, p_{ABCD})
= \tfrac{1}{216}\,(1, 5, 5, 5, 20, 5, 5, 20, 5, 5, 20, 20, 20, 20, 60) .
\]

Solving ET = x^T (J + P2 + P3 + P4 ET) x, we find ET = 2, as we expect. However consider

\[
y^T = \left(0, \tfrac{1}{12}, \tfrac{1}{12}, 0, 0, \tfrac{1}{12}, 0, 0, 0, \tfrac{1}{12}, 0, 0, 0, 0, \tfrac{2}{3}\right) .
\]

Solving ET = y^T (J + P2 + P3 + P4 ET) y gives ET ≈ 1.99858 < 2. Thus, rendezvous
occurs in a smaller expected number of t-steps than it does under AW. This happens when 
players use a mixed 4-Markov strategy of doing AAAB, AABA, ABAA, ABBB, each with
probability 1/12, and ABCD with probability 2/3. This corresponds to choosing tours for
the first two t-steps at random, but then making the choice of tours at the 3rd and 4th t-step
depend on the tours taken at the 1st and 2nd t-step. The choice of y is not unique. It has been
chosen to be simple, containing many 0s, and it was found by using the fact that the
eigenvector of P4 having a negative eigenvalue is of the pattern
(α, β, β, γ, δ, β, γ, δ, γ, β, δ, δ, δ, δ, ε) for some irrational α, β, γ, δ, ε.
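
One way to see the effect of the dependent tours is to enumerate, exactly, the probability that four consecutive t-steps all fail to produce a meeting. The sketch below is our own check and does not reconstruct the matrices P2, P3, P4. It assumes both players tour at every t-step, draw a pattern from the stated distribution, and fill in each new letter with a uniformly random unused tour. The names `MISS`, `SEQS` and `prob_no_meeting` are ours.

```python
from fractions import Fraction
from itertools import permutations, product

tours_I = list(permutations((2, 3, 4)))   # player I, home location 1
tours_II = list(permutations((1, 3, 4)))  # player II, home location 2

# MISS[i]: indices of player II's tours that never coincide with player I's tour i.
MISS = [[j for j, v in enumerate(tours_II) if all(a != b for a, b in zip(u, v))]
        for u in tours_I]

def pattern_of(seq):
    """Relabel a sequence of tour indices as a pattern such as 'AABA'."""
    letters = {}
    for t in seq:
        if t not in letters:
            letters[t] = "ABCD"[len(letters)]
    return "".join(letters[t] for t in seq)

SEQS = {}  # pattern -> all sequences of four tour indices realising it
for seq in product(range(6), repeat=4):
    SEQS.setdefault(pattern_of(seq), []).append(seq)

def prob_no_meeting(pattern_probs):
    """P(no meeting in four t-steps) when both players use pattern_probs."""
    dist = {seq: Fraction(p) / len(SEQS[pat])
            for pat, p in pattern_probs.items() for seq in SEQS[pat]}
    total = Fraction(0)
    for s, ps in dist.items():
        # Player II must pick, at every t-step, a tour that misses player I's tour.
        for t in product(*(MISS[i] for i in s)):
            pt = dist.get(t)
            if pt:
                total += ps * pt
    return total

AW = {pat: Fraction(len(seqs), 6 ** 4) for pat, seqs in SEQS.items()}  # independent uniform tours
NEW = {"AAAB": Fraction(1, 12), "AABA": Fraction(1, 12), "ABAA": Fraction(1, 12),
       "ABBB": Fraction(1, 12), "ABCD": Fraction(2, 3)}

print(prob_no_meeting(AW), prob_no_meeting(NEW))
```

Under these assumptions the two values are 1/16 for independent tours and 1001/16200 (about 0.0618) for the dependent tours, so four consecutive tours are slightly less likely to all miss.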

The above makes it very plausible that we can find a strategy that is better than AW 
on K4. We now need to do some careful calculations. We consider a 12-Markov strategy 
consisting of 4 t-steps. In each t-step a player remains home with probability p, and tours 
with probability 1 - p. When he makes tours, he does so in a manner that achieves the
distribution previously described. That is, any 1st and 2nd tours are made at random, but 
3rd and 4th tours are made so that these are consistent with the distribution over 4 tours being 
AAAB, AABA, ABAA, ABBB, each with probability 1/12, and ABCD with probability 
2/3. If at the end of 12 steps the players have not met then the strategy restarts, forgetting 
about the number of previous t-steps on which players made non-meeting tours. 

We found it easiest to calculate the expected meeting time by attaching a probability to 
each possible 12-step path that the strategy might take. There are 1585 possible paths which
have nonzero probability. We computed the step at which players meet, or the event that they do
not meet, for each of the 1585 x 1585 possibilities, and averaged these using the appropriate 
probabilities. The calculations are intricate, but can be checked in various ways to provide 
confidence that no mistake has been made. It turns out that the expected meeting time is 

\[
ET = \frac{-227773p^8 + 582884p^7 - 1329319p^6 + 1737938p^5 - 1941235p^4 + 1420688p^3 - 998569p^2 + 389834p - 217648}
{3\,(82001p^8 - 218608p^7 + 327728p^6 - 315256p^5 + 215870p^4 - 104656p^3 + 36128p^2 - 8008p - 15199)} .
\]

Taking p = (3√681 - 77)/4, which is the optimal value for the AW strategy, we find that
the new strategy produces an expected meeting time that is less than that of AW by

\[
\frac{243\,(75041961207 + 4700853101\sqrt{681})}{327540887401488016} \approx 0.000146683 .
\]
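
The size of the improvement can be checked by evaluating both expected meeting times at the AW-optimal p. This is a minimal numerical sketch using the coefficients as printed above; the helper names `horner`, `et_new` and `et_aw` are ours.

```python
import math

# Coefficients of the numerator and denominator polynomials, from p^8 down to p^0.
NUM = [-227773, 582884, -1329319, 1737938, -1941235, 1420688, -998569, 389834, -217648]
DEN = [82001, -218608, 327728, -315256, 215870, -104656, 36128, -8008, -15199]

def horner(coeffs, p):
    """Evaluate a polynomial given its coefficients in descending order of degree."""
    value = 0.0
    for c in coeffs:
        value = value * p + c
    return value

def et_new(p):
    """Expected meeting time of the 12-step strategy, as given by the rational function."""
    return horner(NUM, p) / (3 * horner(DEN, p))

def et_aw(p):
    """Expected meeting time of the AW strategy on K4."""
    return (43 - 14 * p + 25 * p * p) / (9 * (1 + 2 * p - 3 * p * p))

p_star = (3 * math.sqrt(681) - 77) / 4
improvement = et_aw(p_star) - et_new(p_star)
print(et_aw(p_star), et_new(p_star), improvement)
```

As a consistency check on the coefficients, the denominator vanishes at p = 1, as it must when both players always stay home and never meet.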

The tiny improvement is due to the fact that when both players do four t-tours (which happens 
with probability (1 — p) 4 ), the new strategy gives a greater probability that the players meet 
than does AW. It would be possible to make the new strategy even better, by choosing p 
slightly differently, or indeed making it depend on the number of tours that have been taken 
so far over which players have not met. We could also do better by not restarting after 12 
steps. However, our aim is not to try to find the best strategy for K4, which still seems very 
difficult, but simply to show that AW is not optimal. This we have now done. 